{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Community Detection with NetworKit "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this notebook we will cover some community detection algorithms implemented in the `community` module of NetworKit. Community detection is concerned with identifying groups of nodes which are significantly more densely connected to each other than to the rest of the network. As a first step we import NetworKit:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import networkit as nk"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `community` module provides a top-level function, [detectCommunities(G, algo=None, inspect=True)](https://networkit.github.io/dev-docs/python_api/community.html?highlight=detect#networkit.community.detectCommunities) to perform community detection of a given graph with a suitable algorithm, and print some statistics about the result. If no algorithm is specified via the `algo` parameter, community detection is performed using the [PLM](https://networkit.github.io/dev-docs/python_api/community.html?highlight=plm#networkit.community.PLM) algorithm.\n",
    "\n",
    "This function can be used as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Read graph\n",
    "G = nk.readGraph(\"../input/karate.graph\", nk.Format.METIS)\n",
    "communities = nk.community.detectCommunities(G)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The following sections cover two popular community detection algorithms, `PLM` and `PLP`, and will illustrate how to use them."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## PLM"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "NetworKit provides a parallel implementation of the well-known Louvain method, which can be found in the [PLM](https://networkit.github.io/dev-docs/python_api/community.html?highlight=plm#networkit.community.PLM) class. It yields a high-quality solution at reasonably fast running times. The constructor `PLM(Graph, refine=False, gamma=0.1, par='balance', maxIter=32, turbo=True, recurse=True)` expects a [networkit.Graph](https://networkit.github.io/dev-docs/python_api/networkit.html?highlight=graph#networkit.Graph) as a mandatory parameter. If the parameter `refine` is set to true, the algorithm performs a second move phase to refine the communities. The parameter `gamma` defines the multi-resolution modularity parameter. The string `par` defines the openmp parallelization strategy. `maxIter` is the maximum number of iterations for move phase. When `turbo` is set to true, the algorithm is faster but uses O(n) additional memory per thread. Set `recurse`to true in order to use recursive coarsening. Refer to [this]( http://journals.aps.org/pre/abstract/10.1103/PhysRevE.89.049902) for more details on recursive coarsening.\n",
    "\n",
    "In the example below we run PLM with `refine` set to true while leaving the rest of the parameters to their default values.\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Choose and initialize algorithm \n",
    "plmCommunities = nk.community.detectCommunities(G, algo=nk.community.PLM(G, True))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The output of the `detectCommunities` function is a partition of the nodes of the graph. It is represented by the [Partition](https://networkit.github.io/dev-docs/python_api/networkit.html?highlight=partition#networkit.Partition) data structure, which provides several methods for inspecting and manipulating a partition of a set of elements."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"{0} elements assigned to {1} subsets\".format(plmCommunities.numberOfElements(),\n",
    "                                                    plmCommunities.numberOfSubsets()))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"the biggest subset has size {0}\".format(max(plmCommunities.subsetSizes())))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The contents of a partition object can be written to file in a simple format, in which the `i`-th line contains an integer representing the subset id of node `i`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "nk.community.writeCommunities(plmCommunities, \"output/communtiesPLM.partition\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## PLP "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The Label Propagation algorithm is an algorithm for finding communities in a graph. NetworKit provides a parallel implementation, [PLP(G, updateThreshold=none, maxIterations=none)](https://networkit.github.io/dev-docs/python_api/community.html?highlight=plp#networkit.community.PLP). The constructor expects a [networkit.Graph](https://networkit.github.io/dev-docs/python_api/networkit.html?highlight=graph#networkit.Graph) as a mandatory parameter. The parameter `updateThreshold` dictates the number of nodes that have to be changed in each iteration so that a new iteration starts, and `maxIterations` is the maximum number of iterations. `none` is NetworKit constant set to the maximum value of a 64-bit integer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Read graph\n",
    "G = nk.readGraph(\"../input/jazz.graph\", nk.Format.METIS)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Choose and initialize algorithm \n",
    "plpCommunities = nk.community.detectCommunities(G, algo=nk.community.PLP(G))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"{0} elements assigned to {1} subsets\".format(plpCommunities.numberOfElements(),\n",
    "                                                    plpCommunities.numberOfSubsets()))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"the biggest subset has size {0}\".format(max(plpCommunities.subsetSizes())))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "nk.community.writeCommunities(plpCommunities, \"output/communtiesPLP.partition\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
